Active covariance estimation by random sub-sampling of variables
Abstract
We study covariance matrix estimation for the case of partially observed random vectors, where different samples contain different subsets of vector coordinates. Each observation is the product of the variable of interest with a $0-1$ Bernoulli random variable. We analyze an unbiased covariance estimator under this model, and derive an error bound that reveals relations between the sub-sampling probabilities and the entries of the covariance matrix. We apply our analysis in an active learning framework, where the expected number of observed variables is small compared to the dimension of the vector of interest, and propose a design of optimal sub-sampling probabilities and an active covariance matrix estimation algorithm.
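The observation model in the abstract can be illustrated with a small sketch. It rests on assumptions that the abstract does not state: the vector of interest is zero-mean, the Bernoulli masks are independent across coordinates and independent of the data, and unobserved entries are stored as zeros. Under those assumptions, the second moment of a masked sample has entries $p_i p_j \Sigma_{ij}$ off the diagonal and $p_i \Sigma_{ii}$ on it, so dividing entrywise by those factors gives an unbiased estimate. The function names `simulate_masked_samples` and `subsampled_covariance` are hypothetical, and this is not necessarily the exact estimator analyzed in the paper.

```python
import numpy as np


def simulate_masked_samples(Sigma, p, n, rng):
    """Draw n zero-mean Gaussian vectors and mask coordinate i with prob. 1 - p[i].

    Gaussian data is an illustrative assumption; unobserved entries become 0.
    """
    d = Sigma.shape[0]
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    mask = rng.random((n, d)) < np.asarray(p)  # True means "observed"
    return X * mask


def subsampled_covariance(Y, p):
    """Rescaled second-moment estimate from masked samples Y of shape (n, d).

    Assumes zero-mean data and independent Bernoulli(p[i]) masks, so that
    E[y y^T] equals Sigma scaled entrywise by p_i * p_j (off-diagonal)
    and by p_i (diagonal).
    """
    Y = np.asarray(Y, dtype=float)
    p = np.asarray(p, dtype=float)
    n = Y.shape[0]
    scale = np.outer(p, p)
    np.fill_diagonal(scale, p)  # a coordinate co-occurs with itself w.p. p_i
    return (Y.T @ Y) / n / scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
    p = np.array([0.5, 0.8])
    Y = simulate_masked_samples(Sigma, p, 200_000, rng)
    print(subsampled_covariance(Y, p))  # should be close to Sigma
```

The entrywise rescaling makes clear why small sampling probabilities are costly: an entry observed with probability $p_i p_j$ is divided by that product, which inflates its variance. This is the kind of trade-off between sub-sampling probabilities and covariance entries that the abstract's error bound is said to capture.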
Related resources
A Covariance Regression Model
Classical regression analysis relates the expectation of a response variable to a linear combination of explanatory variables. In this article, we propose a covariance regression model that parameterizes the covariance matrix of a multivariate response vector as a parsimonious quadratic function of explanatory variables. The approach can be seen as analogous to the mean regression model, and ha...
Almost Sure Convergence Rates for the Estimation of a Covariance Operator for Negatively Associated Samples
Let {Xn, n >= 1} be a strictly stationary sequence of negatively associated random variables, with common continuous and bounded distribution function F. In this paper, we consider the estimation of the two-dimensional distribution function of (X1,Xk+1) based on histogram type estimators as well as the estimation of the covariance function of the limit empirical process induced by the se...
Estimation of the Active Network Size of Kermanian Males
Background: Estimation of the size of hidden and hard-to-reach sub-populations, such as drug-abusers, is a very important but difficult task. Network scale up (NSU) is one of the indirect size estimation techniques, which relies on the frequency of people belonging to a sub-population of interest among the social network of a random sample of the general population. In this study, we estimated ...
An ℓ∞ Eigenvector Perturbation Bound and Its Application to Robust Covariance Estimation
In statistics and machine learning, people are often interested in the eigenvectors (or singular vectors) of certain matrices (e.g. covariance matrices, data matrices, etc). However, those matrices are usually perturbed by noises or statistical errors, either from random sampling or structural patterns. One usually employs Davis-Kahan sin θ theorem to bound the difference between the eigenvecto...
An effect of initial distribution covariance for annealing Gaussian restricted Boltzmann machines
In this paper, we investigate an effect that the covariance of an initial distribution for annealed importance sampling (AIS) exerts on the estimation accuracy for the partition functions of Gaussian restricted Boltzmann machines (RBMs). A common choice for an AIS initial distribution is a Gaussian RBM (GRBM) with zero weight connections. Such an initial distribution does not show any covarianc...